27 research outputs found
Provenance in bioinformatics workflows
In this work, we used the PROV-DM model to manage data provenance in workflows of genome projects. This provenance model allows the storage of details of one workflow execution, e.g., raw and produced data and computational tools, their versions and parameters. Using this model, biologists can access details of one particular execution of a workflow, compare results produced by different executions, and plan new experiments more efficiently. In addition, a provenance simulator was created, which facilitates the inclusion of provenance data of one genome project workflow execution. Finally, we discuss one case study, which aims to identify genes involved in specific metabolic pathways of Bacillus cereus, as well as to compare this isolate with other phylogenetically related bacteria from the Bacillus group. B. cereus is an extremophilic bacterium, collected in warm water in the Midwestern Region of Brazil; its DNA samples were sequenced with an NGS machine
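The kind of record described above (data, tools, versions and parameters of one execution, plus comparison across executions) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class names loosely follow the PROV-DM core terms (entity, activity, agent), while the tool names, file names, parameter `kmer` and helper functions are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """A piece of data (PROV-DM 'entity'), e.g. raw reads or an assembly."""
    id: str
    kind: str  # hypothetical label: "raw" or "produced"


@dataclass(frozen=True)
class Agent:
    """A computational tool (PROV-DM 'agent') together with its version."""
    name: str
    version: str


@dataclass
class Activity:
    """One workflow step (PROV-DM 'activity')."""
    id: str
    agent: Agent      # tool associated with the step
    used: list        # entities consumed by the step
    generated: list   # entities produced by the step
    params: dict = field(default_factory=dict)


@dataclass
class Execution:
    """All steps recorded for one run of the workflow."""
    id: str
    activities: list = field(default_factory=list)

    def lineage(self, entity_id):
        """Return ids of the steps that led to entity_id, nearest step first."""
        producers = {e.id: a for a in self.activities for e in a.generated}
        trail, pending, seen = [], [entity_id], set()
        while pending:
            activity = producers.get(pending.pop())
            if activity is None or activity.id in seen:
                continue
            seen.add(activity.id)
            trail.append(activity.id)
            pending.extend(e.id for e in activity.used)
        return trail


def diff_executions(a, b):
    """Report steps whose tool or parameters differ between two runs."""
    steps_b = {act.id: act for act in b.activities}
    changes = {}
    for act in a.activities:
        other = steps_b.get(act.id)
        if other and (act.agent, act.params) != (other.agent, other.params):
            changes[act.id] = (act.params, other.params)
    return changes


def build_run(run_id, kmer):
    """Build a two-step example run; every name here is made up."""
    reads = Entity("reads.fastq", "raw")
    contigs = Entity("contigs.fasta", "produced")
    genes = Entity("genes.gff", "produced")
    return Execution(run_id, [
        Activity("assemble", Agent("assembler", "1.0"), [reads], [contigs], {"kmer": kmer}),
        Activity("annotate", Agent("annotator", "2.3"), [contigs], [genes]),
    ])


run1 = build_run("run-1", 31)
run2 = build_run("run-2", 55)
print(run1.lineage("genes.gff"))
print(diff_executions(run1, run2))
```

Keeping tools and data as immutable values makes two executions directly comparable, which is the use case the abstract emphasizes.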
An analysis of cosmological perturbations in hydrodynamical and field representations
Density fluctuations of fluids with negative pressure exhibit decreasing time
behaviour in the long wavelength limit, but are strongly unstable in the small
wavelength limit when a hydrodynamical approach is used. On the other hand, the
corresponding gravitational waves are well behaved. We verify that the
instabilities present in density fluctuations are due essentially to the
hydrodynamical representation; if we turn to a field representation that leads
to the same background behaviour, the instabilities are no longer present. In the
long wavelength limit, both approaches give the same results. We show also that
this inequivalence between background and perturbative level is a feature of
negative pressure fluids. When the fluid has positive pressure, the
hydrodynamical representation leads to the same behaviour as the field
representation both at the background and perturbative levels. Comment: Latex file, 18 page
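As a schematic illustration of why the sign of the pressure matters in the hydrodynamical picture, consider the textbook form below. It is not taken from the paper: the barotropic equation of state with constant $\alpha$, and the coefficient functions $A(\alpha)$ and $B(\alpha)$, are assumptions used only to show the structure of the equation.

```latex
% Assumed equation of state with constant alpha; c_s is the adiabatic sound speed.
p = \alpha \rho , \qquad c_s^2 = \frac{\partial p}{\partial \rho} = \alpha

% Schematic evolution of the density contrast \delta_k (comoving wavenumber k,
% scale factor a, Hubble rate H); A and B are alpha-dependent numerical factors.
\ddot{\delta}_k + A(\alpha)\, H\, \dot{\delta}_k
  + \left( \frac{c_s^2 k^2}{a^2} - 4\pi G \rho\, B(\alpha) \right) \delta_k = 0
```

For $\alpha < 0$ the term $c_s^2 k^2 / a^2$ is negative and dominates at large $k$, so short-wavelength modes grow, while for $k \to 0$ it drops out. This is consistent with the qualitative behaviour stated in the abstract.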
PlantRNA_sniffer : a SVM-based workflow to predict long intergenic non-coding RNAs in plants
Non-coding RNAs (ncRNAs) constitute an important set of transcripts produced in the
cells of organisms. Among them, there is a large amount of a particular class of long ncRNAs that are
difficult to predict, the so-called long intergenic ncRNAs (lincRNAs), which might play essential roles
in gene regulation and other cellular processes. Despite the importance of these lincRNAs, there is
still a lack of biological knowledge and, currently, the few computational methods available are so
specific that they cannot be successfully applied to species other than those for which they were
originally designed. Prediction of lncRNAs has been performed with machine learning
techniques. Particularly, for lincRNA prediction, supervised learning methods have been explored
in recent literature. As far as we know, there are no methods or workflows specially designed to
predict lincRNAs in plants. In this context, this work proposes a workflow to predict lincRNAs in
plants that combines known bioinformatics tools with machine
learning techniques, here a support vector machine (SVM). We discuss two case studies that allowed
us to identify novel lincRNAs in sugarcane (Saccharum spp.) and in maize (Zea mays). From the results,
we could also identify differentially expressed lincRNAs in sugarcane and maize plants exposed to
pathogenic and beneficial microorganisms
Roles of non-coding RNA in sugarcane-microbe interaction
Studies have highlighted the importance of non-coding RNA regulation in plant-microbe
interaction. However, the roles of sugarcane microRNAs (miRNAs) in the regulation of disease
responses have not been investigated. Firstly, we screened the sRNA transcriptome of sugarcane
infected with Acidovorax avenae. Conserved and novel miRNAs were identified. Additionally,
small interfering RNAs (siRNAs) were aligned to differentially expressed sequences from the
sugarcane transcriptome. Interestingly, many siRNAs aligned to a transcript encoding a copper-transporter
gene whose expression was induced in the presence of A. avenae, while the siRNAs were
repressed in the presence of A. avenae. Moreover, a long intergenic non-coding RNA was identified
as a potential target or decoy of miR408. To extend the bioinformatics analysis, we carried out
independent inoculations and the expression patterns of six miRNAs were validated by quantitative
reverse transcription-PCR (qRT-PCR). Among these miRNAs, miR408 (a copper-microRNA) was
downregulated. The cleavage of a putative miR408 target, a laccase, was confirmed by a modified
5′ RACE (rapid amplification of cDNA ends) assay. MiR408 was also downregulated in samples
infected with other pathogens, but it was upregulated in the presence of a beneficial diazotrophic
bacterium. Our results suggest that regulation by miR408 is important in sugarcane sensing whether
microorganisms are pathogenic or beneficial, triggering specific miRNA-mediated regulatory
mechanisms accordingly
Los elementos subjetivos del tipo legal [2.ed.]
Publication of the TABLES OF CONTENTS of works recently added to the collection of the Biblioteca
Ministro Oscar Saraiva of the STJ. In compliance with copyright law, we do not make the
full work available. Shelf location: 343(82) P766i 2.ed
A Probabilistic View of Datalog Parallelization
We explore an approach to developing Datalog parallelization strategies that aims at good expected rather than worst-case performance. To illustrate, we consider a very simple parallelization strategy that applies to all Datalog programs. We prove that this has very good expected performance under equal distribution of inputs. This is done using an extension of 0-1 laws adapted to this context. The analysis is confirmed by experimental results on randomly generated data. 1 Introduction The performance requirements of databases for advanced applications, and the increased availability of cheap parallel processing, have naturally lent great importance to the development of parallel processing techniques for databases. Much of the existing research in this direction has focused on parallelization of Datalog queries. In this paper we investigate parallel processing of Datalog from a probabilistic viewpoint. In contrast to existing work, we propose to guide the design and evaluation of para..
An Object-Oriented Framework for the Parallel Join Operation
We propose an object-oriented framework for one of the most frequent and costly operations in parallel database systems: the parallel join. The framework independently captures a great variety of parameters, such as different load balancing procedures and different synchronization disciplines. The framework addresses DBMS flexibility, configuration and extensibility issues, via the instantiation of known algorithms and facilities for the introduction of new ones. The framework can also be used to compare algorithms and to determine the execution scenario an algorithm is best suited for. Related algorithms are grouped in families, suggesting a taxonomy. 1. Introduction Since the introduction of parallel processing architectures, a great deal of effort has been spent on the development of algorithms for relational operators that support intra-operation parallelism -- one of the available techniques for DBMS parallelization [1]. Particularly, the join operator gained much attention due ..
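To make the idea of a parameterizable parallel join concrete, here is a minimal sketch of a hash-partitioned join in which the partitioning (load-balancing) policy is a pluggable argument. It is an illustration only, not the framework proposed in the paper: the function names, the `partitioner` parameter and the sample rows are assumptions. Threads are used to keep the sketch short; CPU-bound joins in CPython would need processes to run truly in parallel.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def hash_partitioner(key, n_parts):
    """Default load-balancing policy: map a join key to a partition by hash."""
    return hash(key) % n_parts


def local_hash_join(left_rows, right_rows, key):
    """Sequential hash join of one partition: build on left, probe with right."""
    table = defaultdict(list)
    for row in left_rows:
        table[row[key]].append(row)
    return [{**l, **r} for r in right_rows for l in table.get(r[key], [])]


def parallel_join(left, right, key, n_parts=4, partitioner=hash_partitioner):
    """Partition both inputs on the join key, join partitions concurrently, merge."""
    left_parts = [[] for _ in range(n_parts)]
    right_parts = [[] for _ in range(n_parts)]
    for row in left:
        left_parts[partitioner(row[key], n_parts)].append(row)
    for row in right:
        right_parts[partitioner(row[key], n_parts)].append(row)
    # Leaving the 'with' block waits for all workers: the only synchronization point.
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        results = list(pool.map(local_hash_join, left_parts, right_parts, [key] * n_parts))
    return [row for part in results for row in part]


employees = [
    {"dept": 1, "name": "Ana"},
    {"dept": 2, "name": "Bo"},
    {"dept": 3, "name": "Cy"},
]
departments = [{"dept": 1, "title": "R&D"}, {"dept": 2, "title": "Ops"}]

joined = parallel_join(employees, departments, "dept", n_parts=2)
print(sorted(joined, key=lambda r: r["name"]))
```

Passing a different `partitioner` (for example, one that accounts for skewed keys) changes the load balancing without touching the join logic, which is the kind of separation of concerns the abstract describes.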
Index Self-tuning with Agent-based Databases
The use of software agents as Database Management System components leads to database systems that may be configured and extended to support new requirements. We focus here on the self-tuning feature, which demands a somewhat intelligent behavior that agents could add to traditional DBMS modules. We propose in this paper an agent-based database architecture to deal with automatic index creation. Implementation issues are also discussed for an architecture that integrates built-in agents with the DBMS